Springer - Publisher Connector

Springer

Gene dispersion is the key determinant of the read count bias in differential expression analysis of RNA-seq data

Author: A Mortazavi
A Oshlack
A Roberts
A Subramanian
AR Barutcu
BR Graveley
C Trapnell
Cancer Genome Atlas Research N
CW Law
Dougu Nam
G Mi
H Edgren
H Li
J Li
JC Marioni
JH Bullard
JH Malone
L Gao
LM Shi
MA Dillies
MD Robinson
MD Robinson
MD Robinson
MD Young
MI Love
P Brennecke
Q Xiong
RO Vidal
S Anders
S Yoon
Sora Yoon
T Ching
TJ Hardcastle
U Nagalakshmi
W Zheng
Y Rahmatallah
YW Liu
Z Wang
ZY Peng
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/05/2017

Background: In differential expression analysis of RNA-sequencing (RNA-seq) read count data for two sample groups, it is known that highly expressed genes (or longer genes) are more likely to be differentially expressed, a phenomenon called read count bias (or gene length bias). This bias has a great effect on downstream Gene Ontology over-representation analysis. However, it has not been systematically analyzed for different replicate types of RNA-seq data. Results: We show, by mathematical inference and by tests on a number of simulated and real RNA-seq datasets, that the dispersion coefficient of a gene in the negative binomial modeling of read counts is the critical determinant of the read count bias (and gene length bias). We demonstrate that the read count bias is mostly confined to data with small gene dispersions (e.g., technical replicates and some genetically identical replicates such as cell lines or inbred animals), and that many biological replicate datasets from unrelated samples do not suffer from such a bias, except for genes with small counts. We also show that the sample-permuting GSEA method yields a considerable number of false positives caused by the read count bias, while the preranked method does not. Conclusion: We showed for the first time that small gene variance (similarly, dispersion) is the main cause of read count bias (and gene length bias), and we analyzed the read count bias for different replicate types of RNA-seq data and its effect on gene-set enrichment analysis.
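The role of dispersion described in this abstract can be illustrated with the usual negative binomial mean-variance relation, Var = mean + dispersion × mean², which gives a squared coefficient of variation of 1/mean + dispersion. The sketch below is not the authors' derivation. It assumes that parameterization (the record does not state one), and the function names and example values are hypothetical. It only shows that relative noise depends strongly on the read count when dispersion is small and hardly at all when dispersion is large.

```python
def nb_variance(mean, dispersion):
    """Variance of a negative binomial count under the assumed
    parameterization Var = mean + dispersion * mean**2."""
    return mean + dispersion * mean ** 2


def nb_cv2(mean, dispersion):
    """Squared coefficient of variation, equal to 1/mean + dispersion."""
    return nb_variance(mean, dispersion) / mean ** 2


def cv2_ratio(low_mean, high_mean, dispersion):
    """How much noisier (in CV^2) a low-count gene is than a high-count gene."""
    return nb_cv2(low_mean, dispersion) / nb_cv2(high_mean, dispersion)


if __name__ == "__main__":
    # Illustrative values only: a small dispersion (technical-replicate-like)
    # and a large dispersion (unrelated-biological-replicate-like).
    for dispersion in (0.001, 0.5):
        ratio = cv2_ratio(low_mean=10, high_mean=1000, dispersion=dispersion)
        print(f"dispersion={dispersion}: CV^2 ratio (10 vs 1000 reads) = {ratio:.2f}")
```

With a small dispersion the high-count gene is far less noisy relative to its mean, so the same fold change is easier to detect at high counts. With a large dispersion the two genes are almost equally noisy, which is consistent with the abstract's claim that the bias is mostly confined to data with small gene dispersions.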

Public Library of Science (PLOS)

ScholarWorks@UNIST

Improving gene-set enrichment analysis of RNA-Seq data with small replicates

Author: A Liberzon
A Subramanian
BR Zeeberg
BS Carver
C Lee
C Trapnell
CW Law
CW Law
D Eddelbuettel
D Nam
D Nam
D Nam
D Nam
D Nam
D Wu
DC Koboldt
Dongmei Li
Dougu Nam
F Rapaport
GK Smyth
H Jiang
H Li
HL Li
J Li
J Li
JC Marioni
JH Bullard
JJ Goeman
JK Pickrell
JK Schwarz
JX Feng
KA Gray
MA Dillies
MA Newton
MD Robinson
MD Robinson
MD Robinson
MD Young
ME Ritchie
MI Love
Q Xiong
Q Xiong
S Anders
S Song
Seon-Young Kim
Sora Yoon
U Nagalakshmi
V Saxena
W Huang da
WT Barry
X Wang
X Wang
Z Wang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 09/11/2016

Deregulated pathways identified from transcriptome data of two sample groups have played a key role in many genomic studies. Gene-set enrichment analysis (GSEA) has been commonly used for pathway or functional analysis of microarray data, and it is also being applied to RNA-seq data. However, most RNA-seq datasets so far have only a small number of replicates. This forces the use of the gene-permuting GSEA method (or preranked GSEA), which results in a large number of false positives due to the inter-gene correlation in each gene-set. We demonstrate that incorporating the absolute gene statistic in one-tailed GSEA considerably improves the false-positive control and the overall discriminatory ability of the gene-permuting GSEA methods for RNA-seq data. To test the performance, a simulation method to generate correlated read counts within a gene-set was newly developed, and a dozen currently available RNA-seq enrichment analysis methods were compared; the proposed methods outperformed others that do not account for the inter-gene correlation. Analysis of real RNA-seq data also supported the proposed methods in terms of false-positive control, ranks of true positives, and biological relevance. An efficient R package (AbsFilterGSEA) coded with C++ (Rcpp) is available from CRAN.
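The idea of combining an absolute gene statistic with a one-tailed, gene-permuting enrichment test can be sketched as follows. This is a minimal illustration under stated assumptions, not the AbsFilterGSEA implementation. The running-sum score, the `weight` parameter, the permutation scheme (random gene sets of the same size), and all function names are hypothetical choices made for this example.

```python
import random


def enrichment_score(ranked_genes, abs_scores, gene_set, weight=1.0):
    """One-tailed running-sum score: the maximum positive deviation only."""
    in_set = [g in gene_set for g in ranked_genes]
    hit_total = sum(abs_scores[g] ** weight
                    for g, hit in zip(ranked_genes, in_set) if hit)
    n_miss = len(ranked_genes) - sum(in_set)
    if hit_total == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for g, hit in zip(ranked_genes, in_set):
        if hit:
            running += abs_scores[g] ** weight / hit_total  # step up on a hit
        else:
            running -= 1.0 / n_miss                          # step down on a miss
        best = max(best, running)
    return best


def abs_gene_permuting_gsea(scores, gene_set, n_perm=1000, seed=0):
    """Rank genes by |score|, then compare the observed one-tailed score
    with scores of random gene sets of the same size."""
    abs_scores = {g: abs(s) for g, s in scores.items()}
    ranked = sorted(abs_scores, key=abs_scores.get, reverse=True)
    observed = enrichment_score(ranked, abs_scores, gene_set)
    rng = random.Random(seed)
    genes = list(abs_scores)
    set_size = len(gene_set & set(genes))
    exceed = 0
    for _ in range(n_perm):
        random_set = set(rng.sample(genes, set_size))
        if enrichment_score(ranked, abs_scores, random_set) >= observed:
            exceed += 1
    p_value = (exceed + 1) / (n_perm + 1)  # add-one to avoid a zero p-value
    return observed, p_value


if __name__ == "__main__":
    # Toy data: five strongly changed genes with mixed signs, fifteen weak ones.
    scores = {"g0": 5.0, "g1": -5.0, "g2": 4.0, "g3": -4.0, "g4": 3.0}
    scores.update({f"g{i}": 0.05 * (i - 4) for i in range(5, 20)})
    es, p = abs_gene_permuting_gsea(scores, {"g0", "g1", "g2", "g3", "g4"})
    print(f"ES = {es:.3f}, p = {p:.4f}")
```

Because genes are ranked by absolute value, a set whose members change in opposite directions still accumulates at the top of the list, and only the positive deviation is tested. The sketch does not model inter-gene correlation, which the abstract identifies as the source of false positives in gene-permuting methods.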

ScholarWorks@UNIST

FigShare

NBLDA: negative binomial linear discriminant analysis for RNA-Seq data

Author: A Oshlack
B Lin
CW Law
D Witten
D Yu
DJ Lorenz
DJ McCarthy
DM Witten
ER Mardis
H Pang
H Wu
Hongyu Zhao
J Li
JC Marioni
JH Bullard
JK Pickrell
JW Lee
Kai Dong
KM Tan
MA Dillies
MD Robinson
MD Robinson
MI Love
O Morozova
S Dudoit
S Huang
SB Montgomery
Tiejun Tong
TJ Hardcastle
WM Landau
Xiang Wan
Y Si
Y Zhou
Z Wang
Publication venue: 'Springer Science and Business Media LLC'

Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data

Author: A Mortazavi
A Oshlack
B Langmead
B Li
B Li
BM Bolstad
C Trapnell
CA Maher
ES Lander
F Li
GK Smyth
GM Church
H Rehrauer
Ho Sun Shon
J Hauke
JC Marioni
JH Schefe
JP Magalhães de
Keun Ho Ryu
M Li
MA Dillies
MD Robinson
MR Teixeira
MY Galperin
P Li
Peipei Li
R Edgar
R Morin
R Patro
S Anders
S Lee
SC Schuster
WB Barbazuk
Y Chu
Y Piao
Yongjun Piao
Z Wang
Publication venue: 'Springer Science and Business Media LLC'

INRIA a CCSD electronic archive server

Life in an arsenic-containing gold mine: genome and physiology of the autotrophic arsenite-oxidizing bacterium Rhizobium sp. NT-26

Author: Andres J
Arsène-Ploetze F
Barbe V
Bertin PN
Brochier-Armanet C
Cleiss-Arnold J
Coppée JY
Dillies MA
Geist L
Joublin A
Koechler S
Lassalle F
Marchal M
Muller D
Médigue C
Nesme D
Plewniak F
Proux C
Ramírez-Bahena MH
Santini JM
Schenowitz C
Sismeiro O
Vallenet D
Publication date: 01/01/2013

Arsenic is widespread in the environment, and its presence is a result of natural or anthropogenic activities. Microbes have developed different mechanisms to deal with toxic compounds such as arsenic, either resisting or metabolizing them. Here, we present the first reference set of genomic, transcriptomic and proteomic data of an Alphaproteobacterium isolated from an arsenic-containing gold mine: Rhizobium sp. NT-26. Although phylogenetically related to the plant-associated bacteria, this organism has lost the major colonizing capabilities needed for symbiosis with legumes. In contrast, the genome of Rhizobium sp. NT-26 comprises a megaplasmid containing the various genes that enable it to metabolize arsenite. Remarkably, although the genes required for arsenite oxidation and flagellar motility/biofilm formation are carried by the megaplasmid and the chromosome, respectively, a coordinate regulation of these two mechanisms was observed. Taken together, these processes illustrate the impact environmental pressure can have on the evolution of bacterial genomes, improving the fitness of bacterial strains by the acquisition of novel functions.

HAL Evry

UCL Discovery

HAL Descartes

Spiral - Imperial College Digital Repository

Cis-regulatory evolution spotlights species differences in the adaptive potential of gene expression plasticity

Author: A Corl
A Dobin
A Eyre-Walker
AM Bolger
B Li
BRE Peirson
BY Kim
C Trapnell
CK Ghalambor
CK Ghalambor
CR Landry
CR Reczek
DI Dayan
E Crispo
EK Fischer
EL Koch
F Duveau
F Fyon
F He
F Mallard
FA Cubillos
FA Simão
H Nordberg
I Tirosh
J de Meaux
J de Meaux
J van Gestel
JDV Dyken
JM Baldwin
JR Auld
K Katoh
K Palacio‐López
KA Steige
KL Sikkink
L Salmela
L-M Chevin
LW Ancel
M Haas
M Kelly
M Nordborg
M Pigliucci
M Seki
M Takou
M-A Dillies
MA Koch
MG Grabherr
MI Love
MT Levine
NA Levis
NA Levis
PJ Wittkopp
PJ Wittkopp
R Lande
RN Gutenkunst
S Anders
SS Gill
T Hirayama
T Izawa
T Nussbaumer
TM Mattila
V Susoy
W-C Ho
WJBLAT Kent
X Huang
Y Benjamini
Y Huang
ZH Lemmon
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2021

Plasticity allows organisms to respond to environmental change. Here the authors compare the distribution of cis-regulatory variants in the transcriptomes of Arabidopsis lyrata and A. halleri after exposure to stress, to trace the role of polygenic selection in the evolution of gene expression plasticity.

Kölner UniversitätsPublikationsServer

Edinburgh Research Explorer

Gene expression clines reveal local adaptation and associated trade-offs at a continental scale

Author: A Addo-Bediako
A Catalan
A Claridge-Chang
A Lizé
AO Bergland
C Pegueroles
CD Kelly
D Porcelli
DJ Hosken
DJ Obbard
DK Fabian
DK Fabian
DN Fisher
G De Jong
GA Parker
GI Lang
GW Gilchrist
H Li
HB Fraser
HE Machado
I Yanai
J Balanyá
J Balanyá
J Maynard Smith
J St. Cyr
JA Castro
JA Reinhardt
JK Hill
JL Tompkins
K von Wyschetzki
KJ Gaston
L Fu
L Holman
L O’Donnell
L Zhao
LB Buckley
LT Lancaster
M Boetzer
M Bárbaro
M Kapun
M Kirkpatrick
M Pascual
M Pascual
M Santos
MA Dillies
MAF Noor
MD Robinson
MG Grabherr
MH Gromko
MR Frazier
MT Levine
N Wedell
NH Barton
NW VanKuren
O Savolainen
P Innocenti
P Lankinen
P Simões
PA Parsons
PS Schmidt
PS Schmidt
PW Harrison
R Hilborn
R Kofler
R Lyne
RA Krebs
RA Patty
S Pitnick
S Vanin
S Yeaman
S Yeaman
SC Stearns
TAR Price
TD Wu
TF Mackay
V Kellerman
VR Chintapalli
X Zhao
Y Chen
Z Bochdanovits
Z Kurbalija-Novičić
ZX Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 07/09/2016

Local adaptation, where fitness in one environment comes at a cost in another, should lead to spatial variation in trade-offs between life history traits and may be critical for population persistence. Recent studies have sought genomic signals of local adaptation, but have often been limited to laboratory populations representing two environmentally different locations of a species' distribution. We measured gene expression, as a proxy for fitness, in males of Drosophila subobscura, occupying a 20° latitudinal and 11 °C thermal range. Uniquely, we sampled six populations and studied both common garden and semi-natural responses to identify signals of local adaptation. We found contrasting patterns of investment: transcripts with expression positively correlated to latitude were enriched for metabolic processes and expressed across all tissues, whereas negatively correlated transcripts were enriched for reproductive processes and expressed primarily in testes. When using only the end populations, to compare our results to previous studies, we found that locally adaptive patterns were obscured. While phenotypic trade-offs between metabolic and reproductive functions across widespread species are well known, our results identify underlying genetic and tissue responses at a continental scale that may be responsible for this. This may contribute to understanding population persistence under environmental change.